25 results found.
Language Type:
Multilingual
Languages:
Hebrew
Availability:
Freely Available
License:
<Not Specified>
Size:
500 KByte
Production Status:
Newly created-finished
Use:
Machine Learning
Paper:
N/A
Documentation:
<Not Specified>
Language Type:
Multilingual
Languages:
Hebrew
Availability:
Freely Available
License:
opensource
Size:
<Not Specified>
Production Status:
Newly created-finished
Use:
Summarisation
Paper:
N/A
Documentation:
none
Language Type:
Multilingual
Languages:
Hebrew
Availability:
Freely Available
License:
Gnu
Size:
100 KByte
Production Status:
Newly created-finished
Use:
<Not Specified>
Paper:
N/A
Documentation:
<Not Specified>
Written
Treebank
Language Type:
Monolingual
Languages:
Afrikaans Akkadian Amharic Ancient Greek Arabic Armenian Assyrian Bambara Basque Belarusian Bhojpuri Breton Bulgarian Buryat Cantonese Catalan Chinese Classical Chinese Coptic Croatian Czech Danish Dutch English Erzya Estonian Faroese Finnish French Galician German Gothic Greek Hebrew Hindi Hindi English Hungarian Indonesian Irish Italian Japanese Karelian Kazakh Komi Permyak Komi Zyrian Korean Kurmanji Latin Latvian Lithuanian Livvi Maltese Marathi Mbya Guarani Moksha Naija North Sami Norwegian Old Church Slavonic Old French Old Russian Persian Polish Portuguese Romanian Russian Sanskrit Scottish Gaelic Serbian Skolt Sami Slovak Slovenian Spanish Swedish Swedish Sign Language Swiss German Tagalog Tamil Telugu Thai Turkish Ukrainian Upper Sorbian Urdu Uyghur Vietnamese Warlpiri Welsh Wolof Yoruba
Availability:
Freely Available
License:
Various
Size:
25 million words
Production Status:
Existing-updated
Use:
Parsing and Tagging
- Paper title: Universal Dependencies v2: An Evergrowing Multilingual Treebank Collection
- Paper track: Written/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Joakim Nivre | Universal Dependencies | /N |
Documentation:
https://universaldependencies.org
Written
Corpus
Language Type:
Multilingual
Languages:
Arabic Bulgarian Catalan Croatian Czech Danish Dutch English Estonian Filipino Finnish French German Greek Hebrew Hindi Hungarian Indonesian Italian Japanese Korean Latvian Lithuanian Malay Norwegian Persian Polish Portuguese Romanian Russian Serbian Simplified Chinese Slovak Slovenian Spanish Swedish Thai Traditional Chinese Turkish Ukrainian Vietnamese
Availability:
Freely Available
License:
CC-BY-SA
Size:
60 GByte
Production Status:
Newly created-in progress
Use:
Language Modelling
- Paper title: Wiki-40B: Multilingual Language Model Dataset
- Paper track: Written/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Rami Al-Rfou | Wiki40B-LM | /N |
Documentation:
None
Written
Corpus
Language Type:
Monolingual
Languages:
Afrikaans Albanian Arabic Armenian Bangla Basque Bosnian Breton Bulgarian Catalan Croatian Czech Danish Dutch English Esperanto Estonian Filipino Finnish French Galician Georgian German Greek Hebrew Hindi Hungarian Icelandic Indonesian Italian Japanese Kazakh Korean Latvian Lithuanian Macedonian Malay Malayalam Norwegian Persian Polish Portuguese Romanian Russian Serbian Sinhala Slovak Slovenian Spanish Swedish Tamil Telugu Thai Turkish Ukrainian Urdu Vietnamese pt_br ze_en ze_zh zh_cn zh_tw
Availability:
Freely Available
License:
<Not Specified>
Size:
22.10G tokens
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: word2word: A Collection of Bilingual Lexicons for 3,564 Language Pairs
- Paper track: Written/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yo Joong Choe | OpenSubtitles2018 | /N |
Documentation:
Yes, on the website.
Written
Lexicon
Language Type:
Monolingual
Languages:
Afrikaans Albanian Arabic Armenian Bangla Basque Bosnian Breton Bulgarian Catalan Croatian Czech Danish Dutch English Esperanto Estonian Filipino Finnish French Galician Georgian German Greek Hebrew Hindi Hungarian Icelandic Indonesian Italian Japanese Kazakh Korean Latvian Lithuanian Macedonian Malay Malayalam Norwegian Persian Polish Portuguese Romanian Russian Serbian Sinhala Slovak Slovenian Spanish Swedish Tamil Telugu Thai Turkish Ukrainian Urdu Vietnamese pt_br ze_en ze_zh zh_cn zh_tw
Availability:
Freely Available
License:
CreativeCommons Attribution 4.0 International
Size:
41 GByte
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: word2word: A Collection of Bilingual Lexicons for 3,564 Language Pairs
- Paper track: Written/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yo Joong Choe | word2word | /N |
Documentation:
Yes, on the website.
Written
Corpus
Language Type:
Monolingual
Languages:
Adyghe Ancient Greek Anglo-Norman Arabic Asturian Azerbaijani Bangla Bashkir Belarusian Breton Bulgarian Catalan Central Kurdish Church Slavic Classical Armenian Classical Syriac Cornish Crimean Tatar Danish English Estonian Faroese Finnish Friulian Galolen Gothic Haida Hebrew Hindi Hungarian Ingrian Irish Italian Kabardian Kalaallisut Kannada Karelian Kashubian Kazakh Khakas Khaling Ladin Latin Latvian Lithuanian Livonian Livvi Low German Lower Sorbian Ludian Maltese Manx Mapuche Middle French Middle High German Middle Low German Murrinh-Patha Navajo Neapolitan No linguistic content Northern Frisian Northern Kurdish Northern Sami Norwegian Bokmål Norwegian Nynorsk Occitan Old English Old French Old Irish Old Saxon Pashto Polish Portuguese Quechua Russian Sanskrit Scottish Gaelic Serbian (Latin) Slovenian Spanish Swahili (Congo - Kinshasa) Swedish Tajik Tatar Telugu Turkish Turkmen Ukrainian Urdu Uzbek Venetian Veps Votic Western Frisian Yiddish Zulu bod ces cym deu ell eus fas fra hye isl kat mkd nld ron sqi
Availability:
Freely Available
License:
CC BY-SA 3.0
Size:
None
Production Status:
Existing-updated
Use:
Morphological Analysis
- Paper title: UniMorph 3.0: Universal Morphology
- Paper track: Infrastructural Issues/Large Projects/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Ekaterina Vylomova | UniMorph | /N |
Documentation:
None
Written
Corpus
Language Type:
Monolingual
Languages:
'Auhelawa Abau Abidji Abu' Arapesh Abun Achagua Achang Achi Achinese Achuar-Shiwiar Aché Acoli Adamawa Fulfulde Adele Adhola Adi Adilabad Gondi Adioukrou Adzera Aekyom Afrikaans Agarabi Aghul Aguacateco Aguaruna Agusan Manobo Agutaynen Aimol Ainu Ajië Ajyíninka Apurucayali Akan Akawaio Akebu Akeu Akha Akoose Akukem Alacatlatzala Mixtec Alamblak Alangan Albanian Alekano Algonquin Alladian Alune Alur Alyawarr Ama (Papua New Guinea) Amanab Amarakaeri Amarasi Amatlán Zapotec Ambai Ambo-Pasco Quechua Ambulas Amele Amganad Ifugao Amharic Amri Karbi Anal Ancient Greek Ancient Hebrew Aneme Wake Angaataha Angal Heneng Angami Naga Angguruk Yali Angloromani Angor Anindilyakwa Anjam Ankave Anmatyerre Anufo Anuki Anyin Ao Naga Apalaí Apasco-Apoala Mixtec Apatani Apinayé Apurinã Arabela Aralle-Tabulahan Arapaho Are Arifama-Miniafia Aringa Aromanian Arop-Lokep Arosi Aruamu Asháninka Ashéninka Pajonal Ashéninka Perené Assamese Assyrian Neo-Aramaic Ata Manobo Atatláhuca Mixtec Au Avaric Avatime Avokaya Awa (Papua New Guinea) Awa-Cuaiquer Awadhi Awara Awiyaana Ayacucho Quechua Ayautla Mazatec Ayutla Mixtec Baatonum Baba Malay Babanki Bada (Indonesia) Baeggu Baelelea Bafia Bafut Baga Sitemu Bahnar Baka (South Sudan) Bakairí Bakwé Balangao Balantak Balinese Balkan Romani Baltic Romani Bambam Bambara Bana Bandial Banggai Bangla Bantoanon Baoulé Barai Barasana-Eduria Bargam Bari Bariai Baruya Bashkir Bassari Batad Ifugao Batak Angkola Batak Dairi Batak Karo Batak Simalungun Batak Toba Bauzi Bawm Chin Beaver Bedjond Beembe Bekwarra Belarusian Belize Kriol English Bemba Benabena Berik Berom Besoa Bete-Bendi Biak Biali Biangai Biatah Bidayuh Biete Bimin Bimoba Binandere Bine Binukid Binumarien Bislama Bissa Bisu Bo-Ung Boko (Benin) Bokobaru Bola Bolinao Bomu Bora Border Kuna Borei Borgu Fulfulde Borong Botolan Sambal Breton Bribri Brooke's Point Palawano Buamu Bugawac Bughotu Buginese Buglere Buhid Bukiyip Bulgarian Buli (Ghana) Bulu Bum Bumbita Arapesh Bunama Burarra Busa Bwanabwana 
Cabécar Cacua Cajamarca Quechua Cajonos Zapotec Calamian Tagbanwa Caluyanun Caló Cameroon Mambila Camsá Candoshi-Shapra Cantonese Capanahua Caquinte Car Nicobarese Carapana Carib Caribbean Hindustani Caribbean Javanese Carpathian Romani Carrier Cashibo-Cacataibo Cashinahua Casiguran Dumagat Agta Catalan Cavineña Cañar Highland Quichua Cebuano Central Aymara Central Bikol Central Bontok Central Cagayan Agta Central Dusun Central Huasteca Nahuatl Central Kurdish Central Malay Central Mazahua Central Mnong Central Ojibwa Central Sama Central Siberian Yupik Central Subanen Central Tunebo Central Yupik Central-Eastern Niger Fulfulde Cerma Chachi Chadian Arabic Chakma Chamacoco Chamorro Chang Naga Chavacano Chayahuita Chayuco Mixtec Chechen Cherokee Chhattisgarhi Chicahuaxtla Triqui Chichicapan Zapotec Chimborazo Highland Quichua Chipaya Chiquihuitlán Mazatec Chiquitano Chiru Chittagonian Choapan Zapotec Choctaw Chokri Naga Chol Chopi Chortí Chothe Naga Chuave Chuj Chumburung Church Slavic Chuukese Chuvash Chácobo Cishingini Classical Syriac Coatecas Altas Zapotec Coatlán Mixe Coatzospan Mixtec Cofán Cogui Colorado Comaltepec Chinantec Copainalá Zoque Copala Triqui Coptic Cornish Cotabato Manobo Coyutla Totonac Crimean Tatar Croatian Cubeo Cuiba Culina Cusco Quechua Da'a Kaili Daasanach Daba Dadibi Daga Dan Dangaléat Dangaura Tharu Danish Dano Darlong Datooga Dawawa Dawro Dedua Deg Dela-Oenale Delo Dendi (Benin) Denya Desano Dhangu-Djangu Dhao Dibabawon Manobo Didinga Digo Dii Dimasa Ditammari Diuxi-Tilantongo Mixtec Divehi Djambarrpuyngu Djimini Senoufo Dobu Dogosé Dogrib Doromu-Koki Doyayo Dupaninan Agta Duri Duruma Dyula Dzongkha East Ambae East Kewa Eastern Apurímac Quechua Eastern Arrernte Eastern Bolivian Guaraní Eastern Bontok Eastern Bru Eastern Canadian Inuktitut Eastern Highland Chatino Eastern Highland Otomi Eastern Huasteca Nahuatl Eastern Karaboro Eastern Khumi Chin Eastern Krahn Eastern Mari Eastern Maroon Creole Eastern Oromo Eastern Penan Eastern Tamang 
Eastern Tawbuid Edolo Egyptian Arabic Eipomek Ejagham Ekajuk El Nayar Cora Enga English Enxet Epena Erzya Ese Ese Ejja Esperanto Estado de México Otomi Estonian Even Ewage-Notu Ewe Ewondo Ezaa Faiwol Falam Chin Fanamaket Far Western Muria Faroese Fasu Fataleka Fiji Hindi Fijian Filipino Filipino Finnish Folopa Fon Fordata Fore Frafra Francisco León Zoque Ga'dang Gagauz Galela Galo Gamo Ganda Gangte Gapapaiwa Garifuna Garo Garrwa Gbagyi Gela Gen Ghayavi Gheg Albanian Ghomala Gidar Gikyode Gilaki Gilbertese Girawa Giryama Gitonga Goan Konkani Gofa Gogo Gokana Golin Gonja Gor Gorontalo Gourmanchéma Guahibo Guajajára Guambiano Guanano Guarayu Guayabero Gude Guerrero Amuzgo Guerrero Nahuatl Guhu-Samane Guinea Kpelle Gujarati Gulay Gumatj Gumuz Gungu Gwahatike Gwere Gwichʼin Haitian Creole Hakha Chin Hakka Chinese Halh Mongolian Halia Hamer-Banna Hanga Hanga Hundi Hanunoo Haruai Hausa Hawai'i Creole English Hawaiian Haya Hdi Hebrew Hehe Helong Highland Oaxaca Chontal Highland Popoluca Highland Puebla Nahuatl Highland Totonac Hiligaynon Hindi Hiri Motu Hixkaryána Hmar Hmong Daw Hmong Njua Hopi Hote Hrangkhol Huallaga Huánuco Quechua Huamalíes-Dos de Mayo Huánuco Quechua Huambisa Huarijio Huastec Huautla Mazatec Huaylas Ancash Quechua Huaylla Wanca Quechua Huehuetla Tepehua Huichol Huli Hungarian Iamalele Iatmul Iban Ibatan Iduna Ifè Igbo Ignaciano Ika Ikwere Ikwo Ila Ilianen Manobo Iloko Imbabura Highland Quichua Imbongu Inabaknon Indonesian Inga Inoke-Yate Inpui Naga Inuktitut Ipili Iranian Persian Iraqw Iraya Irish Islander Creole English Isnag Isthmus Mixe Isthmus Zapotec Isthmus-Mecayapan Nahuatl Italian Itawit Itelmen Iu Mien Ivatan Ivbie North-Okpela-Arhe Iwal Ixil Iyo Iyo'wujwa Chorote Iyojwa'ja Chorote Izere Izii Jalapa De Díaz Mazatec Jamaican Creole English Jamamadí Jamiltepec Mixtec Japanese Jarai Javanese Jola-Fonyi Jola-Kasa Juang Jukun Takum Juquila Mixe Jur Modo Jèrriais Kaansa Kaba Kabiyè Kabyle Kachin Kadiwéu Kafa Kagayanen Kagulu Kahua Kaingang Kaiwá 
Kakabai Kako Kakwa Kala Lagaw Ya Kalaallisut Kalagan Kalam Kalam-Tai Kalanga Kalenjin Kalmyk Kaluli Kamano Kamasau Kambaata Kamula Kamwe Kanasi Kandas Kandawo Kaningdon-Nindem Kaninuwa Kanite Kankanaey Kannada Kapingamarangi Kaqchikel Kara (Papua New Guinea) Kara-Kalpak Karachay-Balkar Karajá Karamojong Karbi Karelian Karkar-Yuri Karon Kasem Kasua Kayabí Kayapó Kazakh Keapara Kein Kekchí Kele (Democratic Republic of Congo) Keley-I Kallahan Keliko Kenga Kenyang Kera Keyagana Khakas Khasi Khehek Khiamniungan Naga Khmer Khumi Chin Kikuyu Kilivila Kim Kimré Kinaray-a Kinyarwanda Kire Kisar Kituba (Democratic Republic of Congo) Klingon Kobon Kok Borok Kom (India) Koma Komba Kombio Komi-Zyrian Konai Konkomba Konni Kono (Sierra Leone) Konso Konyak Naga Koongo Koonzime Koorete Korafe-Yegha Korean Koreguaje Koronadal Blaan Korupun-Sela Kosena Kouya Koyra Chiini Koyraboro Senni Krio Kriol Kuanua Kuanyama Kube Kukele Kuku-Yalanji Kulung (Nepal) Kumam Kuman (Papua New Guinea) Kumyk Kunda Kuni-Boazi Kunimaipa Kuo Kuot Kupang Malay Kupsabiny Kuranko Kuria Kurti Kurukh Kusaal Kutep Kutu Kuwaa Kuwaataay Kwaio Kwamera Kwanga Kwara'ae Kwere Kwoma Kyaka Kyenele Kâte Kʼicheʼ Laari Label Labuk-Kinabatangan Kadazan Lacandon Lachixío Zapotec Ladakhi Lahu Lahu Shi Laka (Chad) Lalana Chinantec Lama (Togo) Lamba Lambayeque Quechua Lambya Lamkang Lampung Api Lango (Uganda) Lao Lashi Latin Latvian Lauje Lealao Chinantec Ledo Kaili Lega-Mwenga Lele (Chad) Lelemi Lenje Lewo Liangmai Naga Liberia Kpelle Limbu Limbum Limos Kalinga Lingala Lisu Literary Chinese Lithuanian Lobala Lobi Logo Lokaa Loko Lole Lolopo Loma (Liberia) Lote Low German Lower Grand Valley Dani Lowland Tarahumara Lozi Luang Lugbara Luguru Lukpa Lundayeh Luo (Kenya and Tanzania) Lutos Luwo Luxembourgish Lyélé Ma'anyan Ma'di Maasina Fulfulde Mabaan Maca Machame Machiguenga Macuna Macushi Mada (Nigeria) Madak Madurese Mafa Mag-antsi Ayta Magahi Magdalena Peñasco Mixtec Maiadomu Maithili Maiwa (Papua New Guinea) Majukayang Kalinga 
Mak (Nigeria) Makaa Makasar Makhuwa Makhuwa-Meetto Makonde Malagasy Malawi Lomwe Malay Malay (individual language) Malayalam Malba Birifor Male (Ethiopia) Maltese Malvi Mam Mamanwa Mamara Senoufo Mamasa Mambwe-Lungu Mampruli Manado Malay Manam Mandarin Chinese Mandinka Manga Kanuri Mangga Buang Manggarai Mangseng Manikion Mankanya Mansaka Manx Mape Mapos Buang Mapuche Maram Naga Maranao Marathi Marba Margos-Yarowilca-Lauricocha Quechua Marik Maring Naga Markweeta Marshallese Maru Masaaba Masai Masana Masbatenyo Maskelynes Matal Matigsalug Manobo Mato Matsés Matu Chin Maung Mauwake Maxakalí Mayangna Mayo Mayoyao Ifugao Mazatlán Mixe Mbay Mbuko Mbula Mbunda Mbyá Guaraní Mekeo Melpa Mende Mende (Papua New Guinea) Mengen Mentawai Merey Mesopotamian Arabic Metaʼ Metlatónoc Mixtec Meyah Mezquital Otomi Mi'kmaw Miahuatlán Zapotec Mian Michoacán Nahuatl Middle Newar Migabac Min Nan Chinese Minangkabau Minaveha Minica Huitoto Mirandese Misima-Panaeati Mising Mitla Zapotec Mixtepec Zapotec Miyobe Mizo Moba Mochi Mocoví Mofu-Gudur Mogofin Mohawk Mokole Molima Mongo Mongondow Mono (Democratic Republic of Congo) Monsang Naga Moose Cree Mopán Maya Morisyen Moro Moroccan Arabic Morokodo Moru Moskona Mossi Motu Mountain Koiali Moyon Naga Mro-Khimi Chin Mufian Muinane Mum Mumuye Muna Mundang Mundani Mundurukú Murle Murrinh-Patha Murui Huitoto Musey Musgu Mussau-Emira Mutu Muyang Muyuw Mwaghavul Mwan Mwani Ménik Mískito Mün Chin Mündü Naasioi Nabak Nadëb Nafaanra Nakanai Nalca Nali Nama Namiae Nande Napo Lowland Quechua Napu Naro Nateni Natügu Navajo Nawdm Nawuri Ndamba Ndau Ndogo Ndonga Nehan Nek Nepali Nepali (individual language) Newari Ngaju Ngalum Ngambay Ngando (Democratic Republic of Congo) Ngangam Ngawn Chin Ngiemboon Ngindo Ngiti Ngombe (Democratic Republic of Congo) Ngulu Ngäbere Nias Nigeria Mambila Nigerian Fulfulde Nigerian Pidgin Nii Nilamba Nimoa Ninzo Nivaclé Nkonya Nobonob Nocte Naga Nogai Nomaande Nomatsiguenga Noon Noone Nopala Chatino North Alaskan Inupiatun 
North Azerbaijani North Bolivian Quechua North Junín Quechua North Mesopotamian Arabic North Mofu North Ndebele North Tairora North Tanna Northeastern Dinka Northern Bobo Madaré Northern Conchucos Ancash Quechua Northern Dagara Northern East Cree Northern Emberá Northern Grebo Northern Kankanay Northern Katang Northern Khmer Northern Kissi Northern Kurdish Northern Oaxaca Nahuatl Northern Paiute Northern Pastaza Quichua Northern Puebla Nahuatl Northern Rengma Naga Northern Sami Northern Sotho Northern Tepehuan Northern Thai Northern Tlaxiaco Mixtec Northern Uzbek Northwest Alaska Inupiatun Northwest Gbaya Northwestern Ojibwa Norwegian Norwegian Bokmål Norwegian Nynorsk Nsenga Ntcham Nuer Nugunu (Cameroon) Nukna Numanggang Nunggubuyu Nyabwa Nyakyusa-Ngonde Nyanja Nyankole Nyaturu Nyindrou Nyishi Nyole Nyoro Nzima Obo Manobo Obolo Ocotepec Mixtec Ocotlán Zapotec Odia Odia (individual language) Odoodee Ogea Oji-Cree Ojitlán Chinantec Oksapmin Oku Olo Ono Orokaiva Orya Ossetic Owa Ozolotepec Zapotec Ozumacín Chinantec Paama Paasaal Paicî Paite Chin Palantla Chinantec Palauan Palikúr Pamona Pampanga Pamplona Atta Panao Huánuco Quechua Pangasinan Panoan Katukína Papantla Totonac Papiamento Paraguayan Guaraní Paranan Parauk Parecís Parkwa Pashto Patamona Patep Patpatar Paumarí Pele-Ata Pennsylvania German Pere Persian (Afghanistan) Peñoles Mixtec Phom Naga Piapoco Pichis Ashéninka Pijin Pilagá Pinotepa Nacional Mixtec Pinyin Piratapuyo Pisaflores Tepehua Pitjantjatjara Plains Cree Plapo Krumen Plateau Malagasy Plautdietsch Pochuri Naga Pogolo Pohnpeian Pokomo Polish Popti' Poqomchi' Portuguese Potawatomi Poumei Naga Psikye Puinave Pular Punjabi Purepecha Pwo Northern Karen Páez Pévé Q'anjob'al Qaqet Querétaro Otomi Quetzaltepec Mixe Quioquitani-Quierí Zapotec Quiotepec Chinantec Rade Rajbanshi Ramoaaina Ranglong Rapanui Rarotongan Rawa Rawang Rejang Rendille Riang (India) Rigwe Rikbaktsa Rinconada Bikol Rincón Zapotec Ringgou Romblomanon Rongmei Naga Rotokas Roviana Rukai 
Rundi Russia Buriat Russian Sa'a Saamia Sabaot Sabu Sadri Safeyoka Sahu Saint Lucian Creole French Salasaca Highland Quichua Saliba Salt-Yui Samba Leko Sambal Samberigi Samoan Sampang San Blas Kuna San Jerónimo Tecóatl Mazatec San Juan Atzingo Popoloca San Juan Colorado Mixtec San Luís Temalacayuca Popoloca San Marcos Tlacoyalco Popoloca San Martín Itunyoso Triqui San Martín Quechua San Mateo Del Mar Huave San Miguel El Grande Mixtec San Pedro Amuzgos Amuzgo San Vicente Coatlán Zapotec Sanapaná Sangir Sango Sangtam Naga Saniyo-Hiyewe Sankaran Maninka Sanskrit Santa María Quiegolani Zapotec Santa María Zacatepec Mixtec Santa Teresa Cora Santali Santo Domingo Albarradas Zapotec Sanumá Saposa Sar Saramaccan Sarangani Blaan Sarangani Manobo Sasak Sateré-Mawé Sayula Popoluca Scots Scottish Gaelic Sea Island Creole English Sebat Bet Gurage Secoya Sedoa Seimat Sekpele Selee Selepet Sena Sepik Iwam Serbian Seselwa Creole French Shambala Shan Sharanahua Sherpa Shilluk Shipibo-Conibo Shona Shuar Siane Sierra Negra Nahuatl Sierra de Juárez Zapotec Siksiká Silacayoapan Mixtec Simte Sinaugoro Sindhi Sinhala Sinte Romani Sio Siona Siriano Sirionó Siroi Sissala Siwu Siyin Chin Slovenian Sochiapam Chinantec Soga Somali Somba-Siawari Songe Sop South Azerbaijani South Bolivian Quechua South Fali South Giziga South Ndebele South Tairora South Ucayali Ashéninka Southeast Ambrym Southeastern Dinka Southeastern Puebla Nahuatl Southeastern Tepehuan Southern Altai Southern Balochi Southern Birifor Southern Bobo Madaré Southern Carrier Southern Conchucos Ancash Quechua Southern Dagaare Southern East Cree Southern Ghale Southern Kalinga Southern Kisi Southern Nambikuára Southern Nuni Southern Pastaza Quechua Southern Puebla Mixtec Southern Rengma Naga Southern Rincon Zapotec Southern Samo Southern Sotho Southern Toussian Southwest Gbaya Southwest Tanna Southwestern Dinka Spanish Sranan Tongo Standard Arabic Standard Estonian Standard Latvian Standard Malay Suau Suba Sudanese Arabic Sudest 
Suena Sukuma Sulka Sumi Naga Sundanese Sunwar Supyire Senoufo Sursurunga Susu Swabian Swahili (individual language) Swati Swedish Swiss German Sylheti Taabwa Tabaa Zapotec Tabaru Tabasco Chontal Tabassaran Tabo Tacana Tachelhit Tagabawa Tagbanwa Tai Dam Tajik Takia Takuu Takwane Talinga-Bwisi Tamasheq Tamil Tampulma Tangkhul Naga (India) Tangoa Tanimuca-Retuarã Tarao Naga Tase Naga Tataltepec Chatino Tatar Tatuyo Taupota Tausug Tawala Tawallammat Tamajaq Tboli Tecpatlán Totonac Tedim Chin Tektiteko Telefol Telugu Tem Tena Lowland Quichua Tenango Otomi Tenharim Tepetotutla Chinantec Tepeuxila Cuicatec Tepo Krumen Tera Tereno Teribe Termanu Teso Tetelcingo Nahuatl Tetum Tetun Dili Teutila Cuicatec Tewa (USA) Texmelucan Zapotec Tezoatlán Mixtec Thado Chin Thai Thangal Naga Tharaka Ticuna Tifal Tigon Mbembe Tigrinya Tii Tikar Timbe Timne Timugon Murut Tinputz Tiruray Tlachichilco Tepehua Tlahuitoltepec Mixe Tlamacazapa Nahuatl Toaripi Toba Tobelo Tohono O'odham Tojolabal Tok Pisin Tol Toma Tombonuo Tonga (Zambia) Tongan Toraja-Sa'dan Toro So Dogon Torres Strait Creole Tosk Albanian Totontepec Mixe Toura (Côte d'Ivoire) Trinitario Trió Tsikimba Tsishingini Tsonga Tswana Tucano Tula Tuma-Irumu Tumak Tumbuka Tumulung Sisaala Tungag Tunisian Arabic Tupuri Turkish Turkmen Tuvinian Tuwali Ifugao Tuwuli Tuyuca Twi Tyap Tz'utujil Tzeltal Tzotzil Uab Meto Uare Ubir Ucayali-Yurúa Ashéninka Udmurt Uduk Ukrainian Uma Umanakaina Umbu-Ungu Umiray Dumaget Agta Una Upper Necaxa Totonac Urak Lawoi' Urarina Urat Urdu Uri Urim Uripiv-Wala-Rano-Atchin Urubú-Kaapor Usan Usarufa Usila Chinantec Uspanteco Uyghur Vagla Vaiphei Venda Vengo Vidunda Vietnamese Vlax Romani Volapük Vunjo Vute Wa Waama Waffa Wagi Wahau Kenyah Waima Waimaha Wala Wali (Ghana) Walmajarri Wamey Wancho Naga Wandala Wantoat Waorani Wapishana Warao Waray Waris Warlpiri Waskia Wayampi Wayana Wayuu Wedau Weri West Kewa West-Central Limba Western Apache Western Arrarnta Western Bolivian Guaraní Western Bukidnon Manobo 
Western Canadian Inuktitut Western Dani Western Highland Chatino Western Highland Purepecha Western Huasteca Nahuatl Western Kanjobal Western Kayah Western Lawa Western Niger Fulfulde Western Penan Western Subanon Western Tawbuid Western Tlacolula Valley Zapotec Whitesands Wichí Lhamtés Güisnay Wichí Lhamtés Nocten Wik-Mungkan Wipi Wiru Wolaytta Wolof Woun Meu Wuvulu-Aua Wè Northern Xaasongaxango Xavánte Xhosa Xicotepec De Juárez Totonac Yagua Yakan Yakut Yalunka Yalálag Zapotec Yamba Yambeta Yaminahua Yanesha' Yanomamö Yao Yaouré Yapese Yaqui Yareba Yareni Zapotec Yatee Zapotec Yatzachi Zapotec Yau (Morobe Province) Yawa Yaweyuha Yele Yemba Yessan-Mayo Yimchungru Naga Yine Yocoboué Dida Yombe Yongbei Zhuang Yongkom Yopno Yoruba Yosondúa Mixtec Yucateco Yucuna Zacatlán-Ahuacatlán-Tepetzintla Nahuatl Zaiwa Zarma Zemba Zeme Naga Zia Zigula Zoogocho Zapotec Zotung Chin Zou Zulgo-Gemzek Zulu Zyphe Chin acc bod ces ct1 cym daf deu dud ell eus fas fra hye isl ixi kat khi kzj leg mf1 mkd mri msa mvc mya nah nld ron sdm slk sqi sum tzt zho Ömie
Availability:
From authors
License:
Size:
Up to 1108 Bibles Other
Production Status:
Existing-updated
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: An Analysis of Massively Multilingual Neural Machine Translation for Low-Resource Languages
- Paper track: Written/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Aaron Mueller | Johns Hopkins University Bible Corpus | /N |
Documentation:
None
Written
Evaluation Tool
Language Type:
Multilingual
Languages:
English French German Hebrew Russian
Availability:
Freely Available
License:
Apache License, Version 2.0
Size:
62 MByte
Production Status:
Newly created-in progress
Use:
Syntactic Evaluation (and Evaluation Set Generators)
- Paper title: Cross-Linguistic Syntactic Evaluation of Word Prediction Models
- Paper track: Long/Interpretability and Analysis of Models for NLP
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Aaron Mueller | CLAMS: Cross-Linguistic Assessment of Models on Syntax | /N |
Documentation:
README.md on Github repository in English




